Rules for how to write hamza in the Arabic script
Preface
The Arabic fonts in this article may not render correctly in the Chrome and Chromium browsers. In this case please use Firefox or Safari instead.
بسم الله الرحمن الرحيم
الحمد لله والصلاة والسلام على نبينا محمد. أما بعد:
1 Introduction
The rules of how to write hamza in the Arabic script are quite complex. The matter is further complicated by the limitations of metal and digital font technology. We attempt to give a comprehensive set of rules for writing hamza and some recommendations for typographers and typesetters.
This article is based on findings from the author’s research. Reports of any errors, filed as issues at https://github.com/adamiturabi/hamza-rules, will be greatly appreciated.
2 Scope
We treat only the orthography of Standard Arabic. The orthography of the Qurʾān (الرسم العثماني) is not discussed.
3 Hamza orthography
3.1 Different ways hamza is written
Hamza is written in four different ways:
- Seated on an ʾalif: أ or إ
- Seated on a wāw: ؤ
- Seated on a yāʾ: ئ
- Unseated: ء
Here are some notes about writing hamza in the above four ways:
When unseated hamza comes between two letters that are joined, then it is written above the line that joins them, for example: خَطِيءَة k͡haṭīʾaḧ. In this word, the yāʾ ي joins to the tāʾ marbūṭaḧ ة.
When unseated hamza is followed by an ʾalif: ءا, the combination of hamza and ʾalif is usually written as آ as a convention. Examples: آمَنَ ʾāmana, ظَمْآن ḍ͡hamʾān, شَنَآن s͡hanaʾān. However, when the ʾalif is a suffix or part of a suffix, or the hamza is doubled, or there is an ʾalif before the hamza then we will write ءا, not آ. Examples: شَيْءَانِ s͡hayʾāni, سَءَّال saʾʾāl, قِرَاءَات qirāʾāt.
When hamza is seated on ʾalif and has a kasraḧ, it is written below the ʾalif: إِ. Otherwise, it is written above the ʾalif: أَ, أُ, أْ.
When hamza is seated on yāʾ ئ the dots of the yāʾ are no longer written. Here’s how it will appear in different positions:
| Isolated | End | Middle | Beginning |
|---|---|---|---|
| ئ | ـئ | ـئـ | ئـ |
Note that hamza seated on yāʾ in the middle position ـئـ is different from unseated hamza between two joining letters ـءـ.
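For readers who handle these forms in software, the following minimal sketch lists the Unicode code points of the hamza forms mentioned above and prints their official character names. The dictionary name `HAMZA_FORMS`, its keys, and the helper `describe` are our own illustrative labels, not terminology from any standard.

```python
import unicodedata

# Keys are illustrative labels; the values are the standard Unicode code points.
HAMZA_FORMS = {
    "unseated": "\u0621",          # ء
    "alif with madda": "\u0622",   # آ
    "on alif (above)": "\u0623",   # أ
    "on waw": "\u0624",            # ؤ
    "on alif (below)": "\u0625",   # إ
    "on ya": "\u0626",             # ئ
}


def describe(forms):
    """Return one line per form: code point, Unicode name, and our label."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch)} ({label})"
        for label, ch in forms.items()
    ]


for line in describe(HAMZA_FORMS):
    print(line)
```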
So how do we know when to write hamza unseated and when seated? And how do we choose between its three different seats? There is a set of rules that we need to follow in order to correctly write hamza. Before we give the rules we will first present the underlying principle behind the rules.
3.2 Rules for determining the seat of hamza
3.2.1 Separate main word from prefixes and suffixes
In order to determine the seat of hamza for a word, we must first separate the main word from any prefixes and suffixes. We will determine the seat of hamza for the main word first. Hamza can occur in three positions in the main word:
- At the beginning of the word
- In the middle of the word
- At the end of the word
We will treat each of these positions below.
3.2.1.1 At the beginning of the word
When hamza occurs at the beginning of a word, then:
- If the hamza carries a long-ā vowel, it is written unseated followed by an ʾalif and written as آ, for example آمَنَ ʾāmana.
- If the hamza carries any other vowel, it is written seated on an ʾalif, and is marked with the appropriate vowel mark, for example أَسْلَمَ ʾaslama, أُرِيدُ ʾurīdu, إِسْلَام ʾislām, إِيمَان ʾīmān, أُوخِذَ ʾūk͡hid͡ha.
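The two cases above, together with the earlier note that a kasraḧ puts the hamza below the ʾalif, can be sketched as a small lookup. This is only an illustration: the function name `initial_hamza` and the way vowels are encoded as the strings "a", "i", "u", "ā", "ī", "ū" are our own assumptions.

```python
ALIF_MADDA = "\u0622"        # آ
ALIF_HAMZA_ABOVE = "\u0623"  # أ
ALIF_HAMZA_BELOW = "\u0625"  # إ
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"


def initial_hamza(vowel):
    """Return the written form of a word-initial hamza with its vowel mark.

    `vowel` is the vowel carried by the hamza (hypothetical encoding:
    "a", "i", "u", "ā", "ī", "ū"). For ī and ū only the hamza and its
    short vowel mark are returned; the following yāʾ or wāw is not included.
    """
    if vowel == "ā":
        return ALIF_MADDA                      # unseated hamza + ʾalif, written آ
    if vowel in ("i", "ī"):
        return ALIF_HAMZA_BELOW + KASRA        # hamza below the ʾalif
    if vowel in ("u", "ū"):
        return ALIF_HAMZA_ABOVE + DAMMA
    if vowel == "a":
        return ALIF_HAMZA_ABOVE + FATHA
    raise ValueError(f"unexpected vowel: {vowel!r}")
```

For example, `initial_hamza("ī")` gives the first two characters of إِيمَان, and `initial_hamza("ā")` gives the آ of آمَنَ.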
3.2.1.2 In the middle of the word
The most general case is when hamza is in the middle of a word.
Arabic has three short vowels, three long vowels, two diphthongs, and a sukūn. Each of these has an order of precedence and a hamza seat.
| Precedence | Vowel | Seat |
|---|---|---|
| 1. | ī/ay | ء |
| 2. | i | ئ |
| 3. | ū/aw | ء |
| 4. | u | ؤ |
| 5. | ā | ء |
| 6. | a | أ |
| 7. | ◌ْ | ء |
Main rule: Consider the vowel on the consonant before the hamza and the shortened vowel on the hamza itself. Determine which of the two vowels is higher in precedence in the above table. The prevailing vowel’s seat will be the seat of the hamza.
Sub-rule: If the main rule determines that hamza is to be seated on ʾalif, and there is a long ā vowel on the hamza using an ʾalif, then hamza shall be unseated. And the combination of ءَا will usually be written as آ.
Examples:
هَيْءَة hayʾaḧ
- Vowel on consonant before hamza: ay
- Shortened vowel on hamza: a
- Precedence: ay
- Seated: ء

خَطِيءَة k͡haṭīʾaḧ
- Vowel on consonant before hamza: ī
- Shortened vowel on hamza: a
- Precedence: ī
- Seated: ء

اسْتِيءَاس ʾistīʾās
- Vowel on consonant before hamza: ī
- Shortened vowel on hamza: a
- Precedence: ī
- Seated: ء (Exception: ءَا is not written as آ when the preceding vowel is ī.)

تَوْءَم tawʾam
- Vowel on consonant before hamza: aw
- Shortened vowel on hamza: a
- Precedence: aw
- Seated: ء

سَائِل sāʾil
- Vowel on consonant before hamza: ā
- Shortened vowel on hamza: i
- Precedence: i
- Seated: ئ

تَسَاؤُل tasāʾul
- Vowel on consonant before hamza: ā
- Shortened vowel on hamza: u
- Precedence: u
- Seated: ؤ

تَسَاءَلَ tasāʾala
- Vowel on consonant before hamza: ā
- Shortened vowel on hamza: a
- Precedence: ā
- Seated: ء

قِرَاءَات qirāʾāt
- Vowel on consonant before hamza: ā
- Shortened vowel on hamza: a
- Precedence: ā
- Seated: ء

مَسْؤُول masʾūl
- Vowel on consonant before hamza: ◌ْ
- Shortened vowel on hamza: u
- Precedence: u
- Seated: ؤ

تَرْئِيس tarʾīs
- Vowel on consonant before hamza: ◌ْ
- Shortened vowel on hamza: i
- Precedence: i
- Seated: ئ

مِرْآة mirʾāḧ
- Vowel on consonant before hamza: ◌ْ
- Shortened vowel on hamza: a
- Precedence: a
- Seated: ء (Using sub-rule.)

ظَمْآن ḍ͡hamʾān
- Vowel on consonant before hamza: ◌ْ
- Shortened vowel on hamza: a
- Precedence: a
- Seated: ء (Using sub-rule.)

مَسْأَلَة masʾalaḧ
- Vowel on consonant before hamza: ◌ْ
- Shortened vowel on hamza: a
- Precedence: a
- Seated: أ

الْمَرْأَة almarʾaḧ
- Vowel on consonant before hamza: ◌ْ
- Shortened vowel on hamza: a
- Precedence: a
- Seated: أ
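The main rule and sub-rule can be sketched in a few lines of code. This is a minimal illustration of the underlying principle only, not of the exceptions treated later; the names `RANK`, `SEAT`, and `medial_seat`, and the encoding of vowels as strings such as "ay" or "sukūn", are our own assumptions.

```python
UNSEATED, ON_ALIF, ON_WAW, ON_YA = "\u0621", "\u0623", "\u0624", "\u0626"

# Precedence rank (1 = highest) for each vowel, as in the precedence table.
RANK = {"ī": 1, "ay": 1, "i": 2, "ū": 3, "aw": 3, "u": 4, "ā": 5, "a": 6, "sukūn": 7}

# Hamza seat associated with each precedence rank.
SEAT = {1: UNSEATED, 2: ON_YA, 3: UNSEATED, 4: ON_WAW,
        5: UNSEATED, 6: ON_ALIF, 7: UNSEATED}


def medial_seat(before, on_hamza, long_a_follows=False):
    """Seat of a word-medial hamza according to the basic principle.

    before         -- vowel on the consonant before the hamza
    on_hamza       -- shortened vowel on the hamza itself ("a", "i", "u", "sukūn")
    long_a_follows -- True if the hamza carries a long ā written with an ʾalif
    """
    # Main rule: the vowel with the higher precedence (lower rank) prevails.
    seat = SEAT[min(RANK[before], RANK[on_hamza])]
    # Sub-rule: a hamza that would sit on ʾalif is unseated before a long ā.
    if seat == ON_ALIF and long_a_follows:
        seat = UNSEATED
    return seat
```

Calling `medial_seat("ā", "i")` reproduces the سَائِل example (seat ئ), and `medial_seat("sukūn", "a", long_a_follows=True)` reproduces مِرْآة (unseated by the sub-rule).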
This is the basic underlying principle although there are a few exceptions. We now give a comprehensive set of rules and many examples that should exemplify this principle and the exceptions.
3.3 Ordered set of rules
1. If hamza occurs at the beginning of a word:
   a. If the hamza carries a long-ā vowel, it is written unseated followed by an ʾalif and written as آ, for example آمَنَ ʾāmana.
   b. If the hamza carries any other vowel, it is written seated on an ʾalif, and is marked with the appropriate vowel mark, for example أَسْلَمَ ʾaslama, أُرِيدُ ʾurīdu, إِسْلَام ʾislām, إِيمَان ʾīmān, أُوخِذَ ʾūk͡hid͡ha.
2. If hamza occurs in the middle of a word:
   a. If there is a long vowel or diphthong before the hamza:
      i. If the hamza is after a long-ī vowel or ay diphthong, then the hamza will be written unseated. (Note: if the hamza has a fat·ḥaḧ and is followed by an ʾalif the combination of the two is usually not replaced by آ as is otherwise commonly done when the ʾalif is not part of a suffix and the hamza is not doubled.) Examples:
         هَيْءَة hayʾaḧ, خَطِيءَة k͡haṭīʾaḧ
         بَرِيءُونَ barīʾūna, بَرِيءَانِ barīʾāni, بَرِيءِينَ barīʾīna, بَرِيءَيْنِ barīʾayni
         شَيْءُهُ s͡hayʾuhu, شَيْءَهُ s͡hayʾahu, شَيْءِهِ s͡hayʾihi, شَيْءَانِ s͡hayʾāni, شَيْءَيْنِ s͡hayʾayni
         مَجِيءُهُ majīʾuhu, مَجِيءَهُ majīʾahu, مَجِيءِهِ majīʾihi
         اسْتِيءَاس ʾistīʾās, اسْتِيءَار ʾistīʾār, اسْتِيءَال ʾistīʾāl.
      ii. If the hamza is after a long-ū vowel or aw diphthong, then:
         - If the hamza has a kasraḧ it is written seated on yāʾ. Examples: سُوئِهِ sūʾihi, ضَوْئِهِ ḍawʾihi
         - Otherwise, the hamza is written unseated. (Note: if the hamza has a fat·ḥaḧ and is not doubled and is followed by an ʾalif which is not part of a suffix, then the combination of unseated hamza and ʾalif is written as آ.) Examples: سُوءَهُ sūʾahu, سُوءَانِ sūʾāni, تَوْءَم tawʾam, ضَوْءَهُ ḍawʾahu, ضَوْءَانِ ḍawʾāni, سُوءُهُ sūʾuhu, يَسُوءُونَ yasūʾūna, نُوآنٌ nūʾānun.
      iii. If the hamza is after a long-ā vowel, then:
         - If the hamza has a kasraḧ it is written seated on yāʾ. Example: سَائِل sāʾil.
         - If the hamza has a ḍammaḧ it is written seated on wāw. Example: تَسَاؤُل tasāʾul.
         - Otherwise, when the hamza has a fat·ḥaḧ, it is written unseated. Examples: تَسَاءَلَ tasāʾala, قِرَاءَات qirāʾāt.
   b. If the letter before the hamza has a sukūn and is not wāw or yāʾ (in which case rule 2.a would apply), then:
      i. If the hamza was originally at the end of the word, but a suffix has been attached to the word such that the hamza is now in the middle of the word, then the hamza will be written unseated. Examples: جُزْءَانِ juzʾāni, عِبْءَانِ ɛibʾāni, عِبْءَيْنِ ɛibʾayni, بُطْءَهُ buṭʾahu, بُطْءُهُ buṭʾuhu, بُطْءِهِ buṭʾihi. (انِ, يْنِ, هُ, and هِ are suffixes).
      ii. Otherwise, if the hamza is originally in the middle of the word, then:
         - If the hamza has a ḍammaḧ it is written seated on wāw. Example: مَسْؤُول masʾūl.
         - If the hamza has a kasraḧ it is written seated on yāʾ. Example: تَرْئِيس tarʾīs.
         - If the hamza has a fat·ḥaḧ then:
            - If it is followed by a long-ā vowel represented by an ʾalif, the hamza is unseated followed by the ʾalif and the combination is written as آ. Examples: مِرْآة mirʾāḧ, ظَمْآن ḍ͡hamʾān.
            - Otherwise, if there is no ʾalif after the hamza, the hamza is written seated on ʾalif. Examples: مَسْأَلَة masʾalaḧ, الْمَرْأَة almarʾaḧ.
   c. If the hamza has a sukūn, then look at the vowel mark on the letter preceding it:
      - If the letter preceding hamza has a kasraḧ, the hamza is written seated on yāʾ. Example: بِئْسَ biʾsa.
      - If the letter preceding hamza has a ḍammaḧ, the hamza is written seated on wāw. Example: سُؤْلَکَ suʾlaka.
      - If the letter preceding hamza has a fat·ḥaḧ, the hamza is written seated on ʾalif. Example: کَأْس kaʾs.
   d. Otherwise, only if the above conditions are not satisfied, then compare the vowel marks of the hamza and the letter before it:
      - If either vowel mark is a kasraḧ then the hamza will be written on a yāʾ. Examples: سُئِلَ suʾila, يَئِسَ yaʾisa, مُتَّکِئِينَ muttakiʾīna, رَئِيس raʾīs.
      - If neither vowel mark is a kasraḧ, and at least one of the vowel marks is a ḍammaḧ, then the hamza will be written on a wāw. Examples: سُؤَال suʾāl, رُؤُوس ruʾūs, لُؤَيّ luʾayy.
      - Otherwise, if both of the vowel marks are fat·ḥaḧs, then:
         - If the hamza is followed by a long-ā vowel represented by an ʾalif, the hamza is written unseated. Example: شَنَآن s͡hanaʾān.
         - Otherwise the hamza will be written on an ʾalif. Examples: سَأَلَ saʾala, رَأَىٰ raʾā.
3. If hamza is at the end of a word, disregard the vowel mark on it and consider only the letter before the hamza.
   a. If there is a long vowel (ā, ī, ū) or a diphthong (aw, ay) before it then the hamza will be written unseated. Examples: دُعَاءُ duɛāʾu, سُوءُ sūʾu, جِيءَ jīʾa, ضَوْءَ ḍawʾa, شَيْءَ s͡hayʾa.
   b. Otherwise, if the previous letter has a sukūn, the hamza will again be unseated. Examples: بُطْءُ buṭʾu, عِبْءُ ɛibʾu, شَطْءُ s͡haṭʾu.
   c. Otherwise, if the previous letter is a doubled wāw with a ḍammaḧ, the hamza will again be unseated. Example: تَبَوُّءُ tabawwuʾu.
   d. Otherwise, if the previous letter has a:
      - kasraḧ, the hamza is written seated on yāʾ. Examples: يُهَدِّئُ yuhaddiʾu, سَيِّئُ sayyiʾu.
      - ḍammaḧ, the hamza is written seated on wāw. Example: بَطُؤَ baṭuʾa.
      - fat·ḥaḧ, the hamza is written seated on ʾalif. Examples: يَهْدَأُ yahdaʾu, مُبْتَدَإِ mubtadaʾi.
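As a rough illustration, the ordered rules above can be walked through in code. This is a sketch under our own assumptions, not a full orthography engine: the function name `hamza_seat`, its parameters, and the string encodings ("initial", "medial", "final", "sukūn", and so on) are hypothetical, and the caller is expected to have already separated the main word from its suffixes.

```python
LONG_OR_DIPHTHONG = {"ā", "ī", "ū", "aw", "ay"}
SHORT_TO_SEAT = {"i": "ya", "u": "waw", "a": "alif"}


def hamza_seat(position, prev=None, vowel=None, *, long_a_follows=False,
               was_final=False, prev_is_doubled_waw=False):
    """Return "unseated", "alif", "waw" or "ya".

    position -- "initial", "medial" or "final"
    prev     -- vowel on the letter before the hamza, or "sukūn"
    vowel    -- vowel on the hamza (shortened for medial hamza), or "sukūn"
    """
    if position == "initial":
        # Long ā is written آ (unseated hamza + ʾalif); anything else sits on ʾalif.
        return "unseated" if vowel == "ā" else "alif"

    if position == "final":
        # Only the letter before the hamza matters.
        if prev in LONG_OR_DIPHTHONG or prev == "sukūn":
            return "unseated"
        if prev == "u" and prev_is_doubled_waw:
            return "unseated"
        return SHORT_TO_SEAT[prev]

    # Medial hamza after a long vowel or diphthong.
    if prev in ("ī", "ay"):
        return "unseated"
    if prev in ("ū", "aw"):
        return "ya" if vowel == "i" else "unseated"
    if prev == "ā":
        return {"i": "ya", "u": "waw"}.get(vowel, "unseated")

    # Medial hamza after a sukūn on a letter other than wāw or yāʾ.
    if prev == "sukūn":
        if was_final:  # hamza became medial only because a suffix was attached
            return "unseated"
        if vowel in ("u", "i"):
            return SHORT_TO_SEAT[vowel]
        return "unseated" if long_a_follows else "alif"

    # Medial hamza that itself has a sukūn: the preceding vowel decides.
    if vowel == "sukūn":
        return SHORT_TO_SEAT[prev]

    # Remaining case: compare the two short vowels, kasraḧ > ḍammaḧ > fat·ḥaḧ.
    if "i" in (prev, vowel):
        return "ya"
    if "u" in (prev, vowel):
        return "waw"
    return "unseated" if long_a_follows else "alif"
```

For instance, `hamza_seat("medial", "sukūn", "a", was_final=True, long_a_follows=True)` corresponds to جُزْءَانِ and returns "unseated". Whether an unseated hamza plus ʾalif is then written as آ is a separate decision that this sketch does not model.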
3.4 Prefixes and suffixes
If hamza is at the beginning of a word, adding a prefix to the word will not alter the writing of the hamza. Examples:
لِ + أُسْتَاذِ = لِأُسْتَاذِ
الْ + آخِرَة = الْآخِرَة
If hamza is at the end of a word, adding a suffix to the word can, in general, alter the writing of the hamza, except in cases that have already been mentioned above. Examples:
مُبْتَدَأَ + انِ = مُبْتَدَءَانِ
دُعَاءُ + هُ = دُعَاؤُهُ
ضَوْءِ + هِ = ضَوْئِهِ
بُطْءُ + هُ = بُطْءُهُ
As we mentioned earlier, when unseated hamza is followed by an ʾalif which is not a suffix: ءا, the combination of hamza and ʾalif is conventionally written as آ. However, if the unseated hamza is doubled or preceded by another ʾalif then it won’t be written as آ. Examples: سَءَّال saʾʾāl, قِرَاءَات qirāʾāt.
4 Tanwīn on final hamza
Tanwīn on a final hamza does not affect the writing of the hamza except in the case of tanwīn al-fat·ḥ. When writing tanwīn al-fat·ḥ on a hamza at the end of a word:
If there is an ʾalif before an unseated hamza اء, then we don’t add a silent ʾalif when writing tanwīn al-fat·ḥ. For example دَاء becomes دَاءً dāʾan, not دَاءًا.
Otherwise, we add the silent ʾalif after the hamza so that the hamza is now in the middle of the word with a suffix ʾalif after it. We now pretend that the hamza has a fat·ḥaḧ and that the ʾalif after it is a long-ā vowel. Then we go through the rules for writing hamza in the middle of a word (given above) to determine how hamza will be written. We then write the an-mark on the hamza. Examples:
- مُبْتَدَأ becomes مُبْتَدَأٌ، مُبْتَدَءًا، مُبْتَدَإٍ
- مَلْجَأ becomes مَلْجَأٌ، مَلْجَءًا، مَلْجَإٍ
- جُزْء becomes جُزْءٌ، جُزْءًا، جُزْءٍ
- شَيْء becomes شَيْءٌ، شَيْءًا، شَيْءٍ
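The procedure for tanwīn al-fat·ḥ can be sketched as follows. The lookup inside is our own reading of how the middle-of-word rules play out once the silent ʾalif is added; it does not treat special cases such as a preceding doubled wāw. The function name `tanwin_fath_ending` and the vowel encoding are assumptions made for illustration.

```python
FATHATAN = "\u064B"   # the an-mark (tanwīn al-fat·ḥ)
ALIF = "\u0627"
UNSEATED, ON_WAW, ON_YA = "\u0621", "\u0624", "\u0626"

KNOWN = {"a", "i", "u", "ā", "ī", "ū", "aw", "ay", "sukūn"}


def tanwin_fath_ending(prev):
    """Return the final hamza with the an-mark and, if needed, the silent ʾalif.

    `prev` is the vowel on the letter before the final hamza, or "sukūn"
    (hypothetical encoding).
    """
    if prev not in KNOWN:
        raise ValueError(f"unexpected vowel: {prev!r}")
    if prev == "ā":
        # An ʾalif already precedes the unseated hamza: no silent ʾalif is added.
        return UNSEATED + FATHATAN
    # Otherwise treat the hamza as medial, with a fat·ḥaḧ and a suffix ʾalif.
    if prev == "i":
        seat = ON_YA        # a kasraḧ on either side gives yāʾ
    elif prev == "u":
        seat = ON_WAW       # otherwise a ḍammaḧ gives wāw
    else:
        seat = UNSEATED     # a, sukūn, ī, ū, aw, ay: unseated before the ʾalif
    return seat + FATHATAN + ALIF
```

For example, `tanwin_fath_ending("sukūn")` yields the ءًا of جُزْءًا, while `tanwin_fath_ending("ā")` yields the ءً of دَاءً.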
5 Variants
There are some historical and regional variants of the above rules. The main one is a variant of rule 2.b.ii above. In this variant, when the letter before hamza has a sukūn, the hamza is generally written unseated. Writers who follow this variant write:
- مَسْءُول instead of مَسْؤُول
- أَسْءِلَة instead of أَسْئِلَة
- مَسْءَلَة instead of مَسْأَلَة
However, this variant does not appear to be followed consistently. For example, al-nas͡hʾaḧ is almost always written النَّشْأَة and never النَّشْءَة.
A second variant avoids the repetition of vowel letters like و and ي. Writers who follow this variant write:
- رُءُوس instead of رُؤُوس
- رَءِيس instead of رَئِيس
6 Typographical limitations
Due to what appears to have been a limitation of typesetting technology in the days of typewriters, metal typography, and early digital typography, unseated hamza between two joining letters ـءـ was usually written as seated on yāʾ instead: ـئـ. Because of this limitation we are now accustomed to seeing:
- شَيْئًا instead of شَيْءًا
- خَطِيئَة instead of خَطِيءَة
- هَيْئَة instead of هَيْءَة
- عِبْئَيْنِ instead of عِبْءَيْنِ
and similar variants.
These variants have become so pervasive that many modern explanations of the rules of hamza orthography present the above as the correct way of writing, and modify their rules with exceptions to allow this writing.
Fortunately, advancements in digital font technology now allow us to return to the original rules. Unfortunately, very few computer fonts today actually implement this feature.
Two fonts that we are aware of that do allow the preferred orthography are:
- Dr Khaled Hosny’s “Amiri”: https://www.amirifont.org/
- DecoType “Naskh”: https://www.decotype.com/oneliner/
With these “hamza-safe fonts” you can always use U+0621 “Arabic Letter Hamza” to type unseated hamza no matter whether it is between joining or non-joining letters. With most other fonts U+0621 “Arabic Letter Hamza” will prevent the surrounding two letters from joining.
This issue has been discussed in detail by Thomas Milo in Unicode L2/14-109: https://unicode.org/L2/L2014/14109-inline-chars.pdf.
However, the problem remains that this is a font-specific hack. It seems that Unicode has no official guidance in this regard.
Another solution, which works with most other fonts, is to fake the correct orthography using a combination of U+0640 “Arabic Tatweel” and U+0654 “Arabic Hamza Above” thus ـٔ to achieve the appearance of unseated hamza between two joining letters. However, this will not work when hamza is between lām and ʾalif in the mandatory lām-ʾalif ligature لا (correct: لءا, incorrect: لـٔا). Such words are rare in Standard Arabic but do exist, e.g. لَءَّال laʾʾāl meaning “pearl-seller”, and مِلْءًا. Perhaps the official solution ought to be that fonts absorb the tatweel and not let it affect the lām-ʾalif ligature when it is input between lām and ʾalif.
Beware that using tatweel in this manner may also affect the searching of characters in a digital document.
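One way to soften the search problem is to normalize text before searching, mapping the faked sequence back to U+0621. The sketch below is a hypothetical helper of our own (`to_plain_hamza`), not part of any library. It assumes the faked hamza is always typed as a tatweel followed by U+0654, possibly with vowel marks in between, and it would also rewrite any tatweel plus hamza-above sequence that was intended for some other purpose.

```python
import re

TATWEEL = "\u0640"
HAMZA_ABOVE = "\u0654"
HAMZA = "\u0621"

# Tatweel, then any marks in U+064B..U+0652 (tanwīn, short vowels,
# shaddaḧ, sukūn), then the combining hamza above.
_FAKED_HAMZA = re.compile(TATWEEL + "([\u064B-\u0652]*)" + HAMZA_ABOVE)


def to_plain_hamza(text):
    """Replace tatweel + hamza above with U+0621, keeping any marks in between."""
    return _FAKED_HAMZA.sub(lambda m: HAMZA + m.group(1), text)


# A word typed with the tatweel trick (kh, ṭ, y, tatweel, hamza above, tāʾ marbūṭaḧ)
faked = "\u062E\u0637\u064A" + TATWEEL + HAMZA_ABOVE + "\u0629"
plain = "\u062E\u0637\u064A" + HAMZA + "\u0629"
print(to_plain_hamza(faked) == plain)
```

Applying `to_plain_hamza` to both the document and the search term before comparing them lets a search for a word typed with U+0621 also find the same word typed with the tatweel trick.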